How to Explore your Opponent's Strategy (almost) Optimally

Authors

David Carmel

Shaul Markovitch

Abstract

This work presents a lookahead-based exploration strategy for a model-based learning agent that enables exploration of the opponent’s behavior during interaction in a multi-agent system. Instead of holding one model, the model-based agent maintains a mixed opponent model, a distribution over a set of models that reflects its uncertainty about the opponent’s strategy. Every action is evaluated according to its long run contribution to the expected utility and to the knowledge regarding the opponent’s strategy. We present an efficient algorithm that returns an almost optimal exploration strategy against a given mixed model, and a learning method for acquiring a mixed model consistent with the opponent’s past behavior. We report experimental results in the Iterated Prisoner’s Dilemma game that demonstrate the superiority of the lookahead-based exploration strategy over other exploration methods.
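
The idea of a mixed opponent model and lookahead evaluation can be illustrated with a small sketch. This is not the paper's algorithm: it assumes a hypothetical set of four deterministic candidate strategies, the conventional Iterated Prisoner's Dilemma payoffs (3, 0, 5, 1), and a plain exhaustive lookahead. All names (`MODELS`, `update`, `value`, `action_value`) are illustrative only. The belief is a distribution over candidate models, updated by discarding models inconsistent with the observed move, and each action is scored by its expected payoff over a fixed horizon, including what it reveals about the opponent.

```python
from typing import Callable, Dict, List, Tuple

# A history is a list of (my_action, opponent_action) pairs; "C" = cooperate, "D" = defect.
History = List[Tuple[str, str]]
Model = Callable[[History], str]

# Conventional IPD payoffs for the learning agent (assumed, not taken from the paper).
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def tit_for_tat(h: History) -> str:
    return "C" if not h else h[-1][0]  # copy the agent's previous move

def always_defect(h: History) -> str:
    return "D"

def always_cooperate(h: History) -> str:
    return "C"

def grim(h: History) -> str:
    return "D" if any(mine == "D" for mine, _ in h) else "C"

# Hypothetical candidate set; the mixed model is a distribution over these names.
MODELS: Dict[str, Model] = {
    "tit_for_tat": tit_for_tat,
    "always_defect": always_defect,
    "always_cooperate": always_cooperate,
    "grim": grim,
}

def update(belief: Dict[str, float], models: Dict[str, Model],
           h: History, observed: str) -> Dict[str, float]:
    """Bayes update for deterministic models: drop mispredicting models, renormalise."""
    kept = {n: p for n, p in belief.items() if models[n](h) == observed}
    total = sum(kept.values())
    if total == 0:
        return dict(belief)  # no candidate explains the move; leave the belief unchanged
    return {n: p / total for n, p in kept.items()}

def action_value(belief: Dict[str, float], models: Dict[str, Model],
                 h: History, action: str, depth: int) -> float:
    """Expected payoff of playing `action` now, then acting greedily for depth-1 more steps."""
    total = 0.0
    for reply in "CD":
        # Probability, under the mixed model, that the opponent plays `reply` now.
        prob = sum(p for n, p in belief.items() if models[n](h) == reply)
        if prob == 0:
            continue
        posterior = update(belief, models, h, reply)
        future = value(posterior, models, h + [(action, reply)], depth - 1)
        total += prob * (PAYOFF[(action, reply)] + future)
    return total

def value(belief: Dict[str, float], models: Dict[str, Model],
          h: History, depth: int) -> float:
    """Expected total payoff of the best depth-step plan against the mixed model."""
    if depth == 0:
        return 0.0
    return max(action_value(belief, models, h, a, depth) for a in "CD")
```

For example, against a belief concentrated on `tit_for_tat`, a two-step lookahead scores cooperating first above defecting first, because defection triggers retaliation in the second round. The exhaustive recursion grows exponentially with the horizon, so this sketch is only meant for very short lookaheads.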

Similar resources

Learn Your Opponent's Strategy (in Polynomial Time)!

Agents that interact in a distributed environment might increase their utility by behaving optimally given the strategies of the other agents. To ...

The Anterior Insula Tracks Behavioral Entropy during an Interpersonal Competitive Game

In competitive situations, individuals need to adjust their behavioral strategy dynamically in response to their opponent's behavior. In the present study, we investigated the neural basis of how individuals adjust their strategy during a simple, competitive game of matching pennies. We used entropy as a behavioral index of randomness in decision-making, because maximizing randomness is thought...

Learn Your Opponent's Strategy (in Polynomial Time)! (an Extended Abstract)

P14: How to Find a Talent?

Talents may be artistic or technical, mental or physical, personal or social. You can be a talented introvert or a talented extrovert. Learning to look for your talents in the right places and building those talents into skills and abilities might take some work, but going about it creatively will let you explore your natural abilities and find your innate talents. You’re not going to fin...

Game Theory of Mind

This paper introduces a model of 'theory of mind', namely, how we represent the intentions and goals of others to optimise our mutual interactions. We draw on ideas from optimum control and game theory to provide a 'game theory of mind'. First, we consider the representations of goals in terms of value functions that are prescribed by utility or rewards. Critically, the joint value functions an...

Publication date: 1998
